Querying XML documents with multi-dimensional markup

Author

  • Peter Siniakov
Abstract

XML documents annotated by different NLP tools accommodate multi-dimensional markup in a single hierarchy. To query such documents, one has to account for the different possible nesting structures of the annotations and the original markup of a document. We propose an expressive pattern language with extended semantics of the sequence pattern, supporting negation, permutation, and regular patterns, that is especially appropriate for querying XML-annotated documents with multi-dimensional markup. The concept of fuzzy matching allows matching of sequences that contain textual fragments and known XML elements independently of how concurrent annotations and original markup are merged. We extend the usual notion of a sequence as a sequence of siblings, allowing sequence elements to be matched on different levels of nesting and thus abstracting from the hierarchy of the XML document. Extended sequence semantics, in combination with the other language patterns, allows more powerful and expressive queries than queries based on regular patterns.
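To make the idea of hierarchy-independent sequence matching concrete, here is a minimal Python sketch. It is not the pattern language or matching algorithm proposed in the paper; it only illustrates, under simplifying assumptions, how a sequence of known elements and text fragments can be matched by character position rather than by sibling order. The helper names `spans` and `match_sequence`, the pattern encoding, and the sample documents are all hypothetical.

```python
import xml.etree.ElementTree as ET

def spans(root):
    """Collect (tag, start, end) character offsets for every element."""
    out = []
    pos = 0

    def walk(el):
        nonlocal pos
        start = pos
        pos += len(el.text or "")
        for child in el:
            walk(child)
            pos += len(child.tail or "")
        out.append((el.tag, start, pos))

    walk(root)
    return out

def match_sequence(root, pattern):
    """True if the pattern items occur back to back in the document text,
    regardless of how deeply each matched element is nested.
    Pattern items are ("elem", tag) or ("text", string) (assumed encoding)."""
    text = "".join(root.itertext())
    elems = spans(root)

    def skip_ws(p):
        while p < len(text) and text[p].isspace():
            p += 1
        return p

    def step(i, p):
        if i == len(pattern):
            return True
        p = skip_ws(p)
        kind, value = pattern[i]
        if kind == "text":
            return text.startswith(value, p) and step(i + 1, p + len(value))
        # An element matches if it starts at p (ignoring leading whitespace).
        return any(tag == value and skip_ws(s) == p and step(i + 1, e)
                   for tag, s, e in elems)

    return any(step(0, p) for p in range(len(text) + 1))

# Two ways of merging the same annotations into one hierarchy.
doc_a = ET.fromstring(
    "<doc><s><np><person>Ann</person></np> visited <loc>Paris</loc></s></doc>")
doc_b = ET.fromstring(
    "<doc><person>Ann</person><vp> visited <np><loc>Paris</loc></np></vp></doc>")

pattern = [("elem", "person"), ("text", "visited"), ("elem", "loc")]
found_a = match_sequence(doc_a, pattern)
found_b = match_sequence(doc_b, pattern)
```

In this sketch, `person` and `loc` are never siblings in `doc_b`, yet the same pattern applies to both documents because adjacency is judged on the underlying text rather than on the tree structure.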

Similar resources

A Comparative Study: Change Detection and Querying Dynamic XML Documents

The efficient management of dynamic XML documents is a complex area of research. The changes to and the size of an XML document throughout its lifetime are limitless. The increasing number of applications that use and exchange XML data is creating a demand for multi-version support and for storing XML documents for future reference. Change detection is an important part of version management to identify differe...

XPath Extension for Querying Concurrent XML Markup

XPath is a language for addressing parts of an XML document. It is used in many XML query languages and it can be used by itself for querying XML documents. While XPath is, in general, efficient for querying individual XML documents, it lacks the features for querying over collections of documents or joining parts of the same document. As the amount of complex document-centric XML data is conti...

Storing and Querying XML Documents Without Using Schema Information

As the popularity of eXtensible Markup Language (XML) continues to increase at an astonishing pace, data management systems for storing and querying large repositories of XML data are urgently needed. In this paper, we investigate using a Relational Database Management System (RDBMS) for storing and querying XML data. We present a mapping scheme, called PAID, for mapping XML documents to relati...

On Efficient Part-match Querying of XML Data

XML has become a de facto standard for representing heterogeneous data on the Internet. From a database point of view, XML is a new approach to data modelling. Implementing a system that lets us store and query XML documents efficiently (a so-called native XML database) requires the development of new techniques. Most XML query languages are based on the langua...

A Type System for Querying XML Documents

In the last few years, the trend of publishing and sharing information on the World Wide Web caused much of the existing electronic data to lie outside of database management systems in the form of so-called Web documents. This process was further eased by the introduction of the eXtensible Markup Language (XML) by the World Wide Web Consortium (W3C) [1], which provided a standard format for We...

Publication date: 2006